The opticlust algorithm yielded high quality OTUs based on the high precision metrics below.
| Parameter | Value |
|---|---|
| cutoff | 0.2 |
| Sensitivity | 0.998 |
| Specificity | 0.999 |
| PPV | 0.998 |
| NPV | 0.999 |
| Accuracy | 0.999 |
| MCC | 0.997 |
| F1score | 0.998 |
| FDR | 0.002 |
The taxonomy assignment is based on the phylotype method in mothur platform. Only taxon observed more than once are displayed.
Figure x: Unfiltered and curated OTU abundance. Visual representation of taxon terms highlight the most abundant taxon based on frequency of being assigned to an OTU or tree nodes. Muribaculaceae is the most frequently assigned family and Muribaculaceae_ge is the most frequent species assigned to most sequences.
OTUs
[1] "Otu001" "Otu002" "Otu003" "Otu004" "Otu005" "Otu006" "Otu007"
[8] "Otu008" "Otu009" "Otu010" "Otu012" "Otu013" "Otu014" "Otu015"
[15] "Otu016" "Otu017" "Otu020" "Otu023" "Otu027" "Otu032" "Otu036"
[22] "Otu038" "Otu042" "Otu045"
Phylum
[1] "Bacteroidetes" "Firmicutes"
Class
[1] "Bacilli" "Bacteroidia" "Clostridia"
Order
[1] "Bacteroidales" "Bifidobacteriales" "Clostridiales"
[4] "Enterobacteriales" "Erysipelotrichales" "Lactobacillales"
[7] "Mollicutes_RF39"
Family
[1] "Bacteroidaceae" "Bifidobacteriaceae" "Clostridiaceae_1"
[4] "Erysipelotrichaceae" "Family_XIII" "Lachnospiraceae"
[7] "Lactobacillaceae" "Mollicutes_RF39_fa" "Muribaculaceae"
[10] "Rikenellaceae" "Ruminococcaceae"
Genus
[1] "Alistipes" "Bacteroides"
[3] "Bifidobacterium" "Candidatus_Arthromitus"
[5] "Escherichia.Shigella" "Family_XIII_ge"
[7] "Flavonifractor" "Fusicatenibacter"
[9] "Lachnoclostridium" "Lachnospiraceae_UCG.006"
[11] "Lactobacillus" "Mollicutes_RF39_ge"
[13] "Muribaculaceae_ge" "Oscillospira"
[15] "Roseburia" "Ruminococcaceae_ge"
[17] "Turicibacter"
Figure x. Rank abundance of eight samples. Package: goeveg
Figure x. Correlation between species identified at phylum-level. Species are ordered alphabetically (top panel) and heuristically (bottom panel)
Figure x: Stacked barplots for species richness. The estimated richness (green bars) was calculated using chao calculator and observed ichness (red bars) was calculated using sobs.
Figure x: Species richness (observed species) displayed by boxplot (A), density plots (B) and histograms (C).
Include are:
OTUbased
Number_clusters Value_Index
2.0000 21.0681
Phylum
Number_clusters Value_Index
2.0000 75.9057
Class
Number_clusters Value_Index
2.0000 50.5688
Order
Number_clusters Value_Index
2.0000 42.1392
Family
Number_clusters Value_Index
3.000 30.029
Genus
Number_clusters Value_Index
3.0000 20.7716
Figure x: Optimal number of OTU clusters. The suggested number of best clusters (dotted line) that could explain most variation is 2 for OTUs (A), 3 for phylum (B), 3 for class (C), 2 for Order (D), 10 for Family (E) and 2 for Genus (F). A high average Silhouette width indicates high quality clustering.
cluster size ave.sil.width
1 1 240 0.65
2 2 120 0.13
cluster size ave.sil.width
1 1 240 0.68
2 2 120 0.24
cluster size ave.sil.width
1 1 239 0.67
2 2 121 0.22
cluster size ave.sil.width
1 1 239 0.67
2 2 121 0.22
cluster size ave.sil.width
1 1 241 0.67
2 2 119 0.10
cluster size ave.sil.width
1 1 235 0.68
2 2 125 0.23
Figure x: Silhouette plot guided by the best number of clusters. Observations with a large Si (almost 1) are very well clustered. A small Si (around 0) means that the observation lies between two clusters while a negative Si are outliers.
Figure x: Scree plot of PCA. Shows which components explain most of the variability in the data. Over 80% of the variances contained in OTU and taxonomy data are retained by the first two principal components. The first PC explains the maximum amount of variation in the data set.
Square root transformation
Wisconsin double standardization
Run 0 stress 0.1320155
Run 1 stress 0.131795
... New best solution
... Procrustes: rmse 0.01737973 max resid 0.1721785
Run 2 stress 0.1308199
... New best solution
... Procrustes: rmse 0.03799815 max resid 0.2148348
Run 3 stress 0.1309166
... Procrustes: rmse 0.01927976 max resid 0.298676
Run 4 stress 0.1321164
Run 5 stress 0.1314121
Run 6 stress 0.1310729
... Procrustes: rmse 0.02276801 max resid 0.2876684
Run 7 stress 0.1314575
Run 8 stress 0.131648
Run 9 stress 0.1309433
... Procrustes: rmse 0.02510678 max resid 0.2631128
Run 10 stress 0.1306583
... New best solution
... Procrustes: rmse 0.01200417 max resid 0.1331687
Run 11 stress 0.1315689
Run 12 stress 0.1307628
... Procrustes: rmse 0.01844118 max resid 0.2927609
Run 13 stress 0.1305934
... New best solution
... Procrustes: rmse 0.0275882 max resid 0.2930939
Run 14 stress 0.1327801
Run 15 stress 0.1308671
... Procrustes: rmse 0.03324444 max resid 0.2022286
Run 16 stress 0.1318062
Run 17 stress 0.1312982
Run 18 stress 0.1310071
... Procrustes: rmse 0.02757587 max resid 0.2797129
Run 19 stress 0.1305531
... New best solution
... Procrustes: rmse 0.008945208 max resid 0.1223186
Run 20 stress 0.1311254
*** No convergence -- monoMDS stopping criteria:
14: no. of iterations >= maxit
6: stress ratio > sratmax
Square root transformation
Wisconsin double standardization
Run 0 stress 0.121395
Run 1 stress 0.1226535
Run 2 stress 0.1218715
... Procrustes: rmse 0.01648084 max resid 0.1473789
Run 3 stress 0.1226283
Run 4 stress 0.1198947
... New best solution
... Procrustes: rmse 0.01205629 max resid 0.08810511
Run 5 stress 0.1204889
Run 6 stress 0.1211368
Run 7 stress 0.1243474
Run 8 stress 0.1214427
Run 9 stress 0.1208254
Run 10 stress 0.1199683
... Procrustes: rmse 0.009068018 max resid 0.09452289
Run 11 stress 0.1225659
Run 12 stress 0.1209852
Run 13 stress 0.1198246
... New best solution
... Procrustes: rmse 0.009211115 max resid 0.1108122
Run 14 stress 0.1214443
Run 15 stress 0.1212073
Run 16 stress 0.1207379
Run 17 stress 0.121377
Run 18 stress 0.1224281
Run 19 stress 0.1207474
Run 20 stress 0.1204284
*** No convergence -- monoMDS stopping criteria:
15: no. of iterations >= maxit
5: stress ratio > sratmax
Square root transformation
Wisconsin double standardization
Run 0 stress 0.1422875
Run 1 stress 0.144234
Run 2 stress 0.1430387
Run 3 stress 0.1430329
Run 4 stress 0.1432225
Run 5 stress 0.1435328
Run 6 stress 0.1435968
Run 7 stress 0.1441753
Run 8 stress 0.1438901
Run 9 stress 0.1430608
Run 10 stress 0.1426484
... Procrustes: rmse 0.01034154 max resid 0.1433039
Run 11 stress 0.1434003
Run 12 stress 0.142515
... Procrustes: rmse 0.004608976 max resid 0.05952907
Run 13 stress 0.1444387
Run 14 stress 0.1435266
Run 15 stress 0.1446007
Run 16 stress 0.1436055
Run 17 stress 0.1435066
Run 18 stress 0.143452
Run 19 stress 0.1436154
Run 20 stress 0.1511232
*** No convergence -- monoMDS stopping criteria:
8: no. of iterations >= maxit
12: stress ratio > sratmax
Square root transformation
Wisconsin double standardization
Run 0 stress 0.1355536
Run 1 stress 0.1355325
... New best solution
... Procrustes: rmse 0.00733575 max resid 0.09385795
Run 2 stress 0.1443072
Run 3 stress 0.1386428
Run 4 stress 0.135253
... New best solution
... Procrustes: rmse 0.008139862 max resid 0.1145029
Run 5 stress 0.1357678
Run 6 stress 0.1428662
Run 7 stress 0.1366844
Run 8 stress 0.1351755
... New best solution
... Procrustes: rmse 0.00813885 max resid 0.1237319
Run 9 stress 0.1382688
Run 10 stress 0.1355883
... Procrustes: rmse 0.007218701 max resid 0.07368605
Run 11 stress 0.1354217
... Procrustes: rmse 0.006385372 max resid 0.09222482
Run 12 stress 0.1355949
... Procrustes: rmse 0.007018928 max resid 0.09511988
Run 13 stress 0.1396045
Run 14 stress 0.1359311
Run 15 stress 0.1353594
... Procrustes: rmse 0.004678551 max resid 0.04709641
Run 16 stress 0.1360675
Run 17 stress 0.1365917
Run 18 stress 0.1354206
... Procrustes: rmse 0.005836932 max resid 0.06944482
Run 19 stress 0.1361279
Run 20 stress 0.1356546
... Procrustes: rmse 0.007671726 max resid 0.0759837
*** No convergence -- monoMDS stopping criteria:
17: no. of iterations >= maxit
3: stress ratio > sratmax
Square root transformation
Wisconsin double standardization
Run 0 stress 0.1432815
Run 1 stress 0.1429304
... New best solution
... Procrustes: rmse 0.01350941 max resid 0.1931366
Run 2 stress 0.1459068
Run 3 stress 0.1429461
... Procrustes: rmse 0.007475489 max resid 0.09080252
Run 4 stress 0.1450437
Run 5 stress 0.1444651
Run 6 stress 0.1458861
Run 7 stress 0.1450571
Run 8 stress 0.1434276
... Procrustes: rmse 0.009615488 max resid 0.1266274
Run 9 stress 0.1432483
... Procrustes: rmse 0.008113319 max resid 0.115036
Run 10 stress 0.1439801
Run 11 stress 0.1441929
Run 12 stress 0.1435686
Run 13 stress 0.1429727
... Procrustes: rmse 0.006387782 max resid 0.09585337
Run 14 stress 0.145977
Run 15 stress 0.142738
... New best solution
... Procrustes: rmse 0.008637825 max resid 0.09908415
Run 16 stress 0.1488512
Run 17 stress 0.1453525
Run 18 stress 0.1429162
... Procrustes: rmse 0.006302938 max resid 0.07125081
Run 19 stress 0.143163
... Procrustes: rmse 0.008844548 max resid 0.1058869
Run 20 stress 0.1436694
*** No convergence -- monoMDS stopping criteria:
17: no. of iterations >= maxit
3: stress ratio > sratmax
Square root transformation
Wisconsin double standardization
Run 0 stress 0.1483427
Run 1 stress 0.1481365
... New best solution
... Procrustes: rmse 0.03931585 max resid 0.6391495
Run 2 stress 0.1477953
... New best solution
... Procrustes: rmse 0.03100381 max resid 0.4892752
Run 3 stress 0.1496372
Run 4 stress 0.1495392
Run 5 stress 0.1480876
... Procrustes: rmse 0.01591379 max resid 0.249
Run 6 stress 0.1480336
... Procrustes: rmse 0.02721307 max resid 0.4228015
Run 7 stress 0.1479726
... Procrustes: rmse 0.01052854 max resid 0.1656784
Run 8 stress 0.150896
Run 9 stress 0.1482884
... Procrustes: rmse 0.022387 max resid 0.33595
Run 10 stress 0.1482097
... Procrustes: rmse 0.03535344 max resid 0.564696
Run 11 stress 0.1480284
... Procrustes: rmse 0.02688119 max resid 0.4016029
Run 12 stress 0.1485284
Run 13 stress 0.1490377
Run 14 stress 0.1479164
... Procrustes: rmse 0.009144612 max resid 0.137554
Run 15 stress 0.1482056
... Procrustes: rmse 0.03356324 max resid 0.5510062
Run 16 stress 0.1488457
Run 17 stress 0.1480399
... Procrustes: rmse 0.01489924 max resid 0.2183789
Run 18 stress 0.147993
... Procrustes: rmse 0.02910497 max resid 0.4506213
Run 19 stress 0.1490598
Run 20 stress 0.147806
... Procrustes: rmse 0.01093966 max resid 0.1569197
*** No convergence -- monoMDS stopping criteria:
11: no. of iterations >= maxit
9: stress ratio > sratmax
OTUs
----------------------------
Call:
metaMDS(comm = otu.t, distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(otu.t))
Distance: bray
Dimensions: 3
Stress: 0.1305531
Stress type 1, weak ties
No convergent solutions - best solution after 20 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(otu.t))'
Phylum
----------------------------
Call:
metaMDS(comm = phylum.t, distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(phylum.t))
Distance: bray
Dimensions: 3
Stress: 0.1198246
Stress type 1, weak ties
No convergent solutions - best solution after 20 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(phylum.t))'
Class
----------------------------
Call:
metaMDS(comm = class.t, distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(class.t))
Distance: bray
Dimensions: 3
Stress: 0.1422875
Stress type 1, weak ties
No convergent solutions - best solution after 20 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(class.t))'
Order
----------------------------
Call:
metaMDS(comm = order.t, distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(order.t))
Distance: bray
Dimensions: 3
Stress: 0.1351755
Stress type 1, weak ties
No convergent solutions - best solution after 20 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(order.t))'
Family
----------------------------
Call:
metaMDS(comm = family.t, distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(family.t))
Distance: bray
Dimensions: 3
Stress: 0.142738
Stress type 1, weak ties
No convergent solutions - best solution after 20 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(family.t))'
Genus
----------------------------
Call:
metaMDS(comm = genus.t, distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(genus.t))
Distance: bray
Dimensions: 3
Stress: 0.1477953
Stress type 1, weak ties
No convergent solutions - best solution after 20 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(genus.t))'
Figure X. Sherperd and non-metric multidimensional scaling plot. Green oints represent samples and red points represent OTU or species. Similar samples are ordinated together. Stress values are shown at the botthom of ordination plot.
Figure x: The circular phylograms (A), unrooted cladogram (B), and the rectangular phylograms (C) display the relationships of the 360 samples used in the case study. Female (red) and male (blue) linked with sequence counts showing the proportion of the number of classified (green) and unclassified (red) displayed on a pie chart followed by the phyla abundance (heatmap) and species richness bar chart showing the observed (green) and estimated (maroon) richness. A portion of the tree (D) is enlarged to show some details.
R version 3.5.2 (2018-12-20)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.4
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] factoextra_1.0.5 cluster_2.0.7-1 NbClust_3.0 iNEXT_2.0.19
[5] goeveg_0.4.2 vegan_2.5-4 permute_0.9-4 scales_1.0.0
[9] ggpubr_0.2 magrittr_1.5 dplyr_0.8.0.1 reshape2_1.4.3
[13] funModeling_1.7 Hmisc_4.2-0 ggplot2_3.1.0 Formula_1.2-3
[17] survival_2.43-3 lattice_0.20-38
loaded via a namespace (and not attached):
[1] colorspace_1.4-1 htmlTable_1.13.1 XVector_0.22.0
[4] base64enc_0.1-3 rstudioapi_0.10 ggrepel_0.8.0
[7] xml2_1.2.0 leaps_3.0 codetools_0.2-16
[10] splines_3.5.2 knitr_1.22 ade4_1.7-13
[13] spam_2.2-1 jsonlite_1.6 entropy_1.2.1
[16] phyloseq_1.26.1 readr_1.3.1 compiler_3.5.2
[19] backports_1.1.3 assertthat_0.2.1 Matrix_1.2-15
[22] lazyeval_0.2.1 acepack_1.4.1 htmltools_0.3.6
[25] tools_3.5.2 igraph_1.2.4 dotCall64_1.0-0
[28] NLP_0.2-0 gtable_0.2.0 glue_1.3.1
[31] FactoMineR_1.41 maps_3.3.0 Rcpp_1.0.1
[34] slam_0.1-45 Biobase_2.42.0 Biostrings_2.50.2
[37] multtest_2.38.0 gdata_2.18.0 ape_5.2
[40] nlme_3.1-137 iterators_1.0.10 xfun_0.6
[43] stringr_1.4.0 gtools_3.8.1 zlibbioc_1.28.0
[46] MASS_7.3-51.1 hms_0.4.2 parallel_3.5.2
[49] biomformat_1.10.1 rhdf5_2.26.2 RColorBrewer_1.1-2
[52] fields_9.6 yaml_2.2.0 gridExtra_2.3
[55] pander_0.6.3 rpart_4.1-13 latticeExtra_0.6-28
[58] stringi_1.4.3 S4Vectors_0.20.1 corrplot_0.84
[61] foreach_1.4.4 checkmate_1.9.1 caTools_1.17.1.1
[64] BiocGenerics_0.28.0 rlang_0.3.4 pkgconfig_2.0.2
[67] moments_0.14 bitops_1.0-6 evaluate_0.13
[70] ROCR_1.0-7 purrr_0.3.2 Rhdf5lib_1.4.3
[73] htmlwidgets_1.3 labeling_0.3 cowplot_0.9.4
[76] tidyselect_0.2.5 plyr_1.8.4 R6_2.4.0
[79] IRanges_2.16.0 gplots_3.0.1.1 pillar_1.3.1
[82] foreign_0.8-71 withr_2.1.2 mgcv_1.8-27
[85] scatterplot3d_0.3-41 nnet_7.3-12 tibble_2.1.1
[88] crayon_1.3.4 KernSmooth_2.23-15 microbiome_1.4.2
[91] rmarkdown_1.12 data.table_1.12.0 flashClust_1.01-2
[94] digest_0.6.18 tm_0.7-6 tidyr_0.8.3
[97] stats4_3.5.2 munsell_0.5.0